Human behavior in Prisoner's Dilemma experiments suppresses network reciprocity

During the last few years, much research has been devoted to strategic interactions on complex
networks. In this context, the Prisoner's Dilemma has become a paradigmatic model, and it has
been established that imitative evolutionary dynamics lead to very different outcomes depending on
the details of the network. We report here that when one takes into account the real behavior of
people observed in experiments, the observed level of cooperation is the same at the mean-field
level and on utterly different networks. We thus show that when human subjects interact
in a heterogeneous mix including cooperators, defectors and moody conditional cooperators, the
structure of the population neither promotes nor inhibits cooperation with respect to a well-mixed
population.

PACS numbers: 87.23.Kg, 89.65.-s, 89.75.-k, 64.60.Aq 



In recent years, the physics of complex systems has 
widened its scope by considering interacting many- 
particle models where the interaction goes beyond the 
usual concept of force. One such line of research that has 
proven particularly interesting is evolutionary game theory
on graphs [2], in which interaction between agents
is given by a game while their own state is described by a 
strategy subject to an evolutionary process [3, 4]. A game 
that has attracted a lot of attention in this respect is the 
Prisoner's Dilemma (PD) [5], a model of a situation
in which cooperative actions lead to the best outcome 
in social terms, but where free riders or non-cooperative 
individuals can benefit the most individually. In mathematical
terms, this is described by a payoff matrix (entries
correspond to the row player's payoffs, and C and
D are respectively the cooperative and non-cooperative
actions)

$$
\begin{array}{c|cc}
  & C & D \\ \hline
C & 1 & S \\
D & T & 0
\end{array}
\qquad (1)
$$

with $T > 1$ (temptation to free-ride) and $S < 0$ (detriment
in cooperating when the other does not).

In a pioneering work, Nowak and May [3] showed that 
the behavior observed in a repeated Prisoner's Dilemma 
was dramatically different on a lattice than in a mean- 
field approach: Indeed, on a lattice the cooperative strat- 
egy was able to prevail by forming clusters of alike agents 
who outcompeted defection. Subsequently, the problem
was considered in literally hundreds of papers [1], and
very many differences between structured and well-mixed
(mean-field) populations were identified, although these
were by no means always in favor of cooperation.

In fact, it has recently been realized that this problem
is very sensitive to the details of the system, in
particular to the type of evolutionary dynamics
considered. For this reason experimental input is needed in
order to reach a sound conclusion about what has been
referred to as 'network reciprocity'.

In this Letter, we show that, when the outcome of
the experiments is used to inform theoretical models,
the behavior of agents playing a PD is the same at the
mean-field level and on very different networks. To this
end, instead of considering some ad hoc imitative dynamics
[3, 12, 13], our players will play according to the strategy
recently uncovered by Grujić et al. [14] in the largest
experiment reported to date on the repeated spatial
PD, carried out on a lattice as in Nowak and May's paper
with parameters $T = 1.43$ and $S = 0$.

The results of the experiment were novel in several
respects. First, the population of players exhibited a
rather low level of cooperation (fraction of cooperative
actions in every round of the game in the steady state),
hereafter denoted by $\langle c \rangle$. Most important, however, was
the unraveling of the structure of the strategies. The
analysis of the actions taken by the players showed a
heterogeneous population consisting of "mostly defectors"
(who defected with probability larger than 0.8), a few
"mostly cooperators" (who cooperated with probability larger
than 0.8), and a majority of so-called moody conditional
cooperators. This last group consisted of players who
switched from cooperation to defection with probability
$1 - d - \gamma c_i$ and from defection to cooperation with
probability $a + \beta c_i$, $c_i$ being the fraction of
cooperative actions in player $i$'s neighborhood in the
previous iteration. Conditional cooperation, i.e., the
dependence of the chosen strategy on the amount of
cooperation received, had been reported
earlier in related experiments [15] and observed also for
the spatial repeated PD at a smaller scale [16]. The new
ingredient revealed in Grujić et al.'s experiment [14] was
the dependence of the behavior on the player's own previous
action, hence the reason to call them "moody".
Recent experiments on the multiplayer repeated PD
confirm this observation [17].
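The behavioral rule just described can be written as a short function. The following is a minimal sketch, not the code used in the experiment or in the simulations: the function names are hypothetical, the parameter values are those quoted in the caption of Fig. 1, and the fixed cooperation probabilities 0.8 and 0.2 for the two unconditional types are the ones used in the mean-field analysis.

```python
import random

# Parameter values quoted in the caption of Fig. 1 (taken there from Ref. [14]).
# All names below are illustrative assumptions, not the authors' code.
D, GAMMA, A, BETA = 0.38, 0.62, 0.15, -0.1


def clip01(p):
    """Keep a probability inside [0, 1]."""
    return min(1.0, max(0.0, p))


def coop_probability(kind, prev_cooperated, c_i, p_c=0.8, p_d=0.2):
    """Probability that a player cooperates in the next round.

    kind: 'C' (mostly cooperator), 'D' (mostly defector) or 'X' (moody
    conditional cooperator); prev_cooperated: the player's own previous
    action; c_i: fraction of cooperating neighbours in the previous round.
    """
    if kind == "C":
        return p_c
    if kind == "D":
        return p_d
    if prev_cooperated:
        return clip01(D + GAMMA * c_i)  # keeps cooperating
    return clip01(A + BETA * c_i)       # switches from defection to cooperation


def next_action(kind, prev_cooperated, c_i, rng=random):
    """Draw the next action: True = cooperate, False = defect."""
    return rng.random() < coop_probability(kind, prev_cooperated, c_i)
```

The "moody" character shows up in the two branches for type X: the same neighborhood leads to different cooperation probabilities depending on the player's own previous action.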

To study how the newly unveiled rules influence the
emergence of cooperation in a structured population of
individuals, we first report results from numerical
simulations of a system made up of $N = 10^4$ individuals who
play a repeated PD game according to the experimental
observations. To this end, we explored the average
level of cooperation in four different network
configurations: a well-mixed population in which the probability
that a player interacts with any other one is the same for
all players, a square lattice, an Erdős-Rényi (ER) graph
and a Barabási-Albert (BA) scale-free (SF) network. It
is worth mentioning that the dependence on the payoff
matrix only enters through the parameters describing the
players' behavior ($d$, $\gamma$, $a$, $\beta$ and the fractions of the three
types of players). Once these parameters are fixed the
payoffs do not enter anywhere in the evolution, as this is
only determined by the variables $c_i$, the local fractions of
cooperative actions within each player's neighborhood.
Thus there is no possibility to explore the dependence on
the payoffs because we lack a connection between them
and the behavioral parameters.
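A reduced version of such a simulation fits in a few lines. The sketch below is an illustration under stated assumptions, not the authors' code: it uses a small 30 x 30 lattice with eight neighbors and periodic borders instead of $N = 10^4$, emulates the well-mixed case by redrawing eight random partners per player in every round, starts from arbitrary random actions, and takes the behavioral parameters from the caption of Fig. 1 with fractions of 0.2 for each unconditional type chosen only for illustration.

```python
import random

D, GAMMA, A, BETA = 0.38, 0.62, 0.15, -0.1  # behavioural parameters (Fig. 1)
P_C, P_D = 0.8, 0.2                         # fixed cooperation probabilities


def lattice_neighbours(side):
    """Eight nearest neighbours on a side x side lattice, periodic borders."""
    nbrs = []
    for i in range(side * side):
        x, y = divmod(i, side)
        nbrs.append([((x + dx) % side) * side + (y + dy) % side
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0)])
    return nbrs


def simulate(n, rho_c, rho_d, neighbours=None, k=8, rounds=150, tail=50, seed=0):
    """Average cooperation level over the last `tail` rounds.

    neighbours=None emulates a well-mixed population by drawing k random
    partners for every conditional player in every round (an assumption).
    """
    rng = random.Random(seed)
    kinds = rng.choices("CDX", weights=[rho_c, rho_d, 1 - rho_c - rho_d], k=n)
    acts = [rng.random() < 0.5 for _ in range(n)]  # arbitrary initial actions
    levels = []
    for _ in range(rounds):
        new = []
        for i in range(n):
            if kinds[i] == "C":
                p = P_C
            elif kinds[i] == "D":
                p = P_D
            else:
                nb = neighbours[i] if neighbours else rng.sample(range(n), k)
                c_i = sum(acts[j] for j in nb) / len(nb)
                p = D + GAMMA * c_i if acts[i] else A + BETA * c_i
            new.append(rng.random() < p)
        acts = new  # synchronous update of all players
        levels.append(sum(acts) / n)
    return sum(levels[-tail:]) / tail


side = 30
c_lattice = simulate(side * side, 0.2, 0.2, lattice_neighbours(side))
c_mixed = simulate(side * side, 0.2, 0.2)
print(f"lattice: {c_lattice:.3f}   well-mixed: {c_mixed:.3f}")
```

Note that the payoff parameters $T$ and $S$ never appear in the update loop, which is the point made in the paragraph above.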

In Fig. 1 we present our most striking result. The figure
represents, in a color-coded scale, the average level of
cooperation as a function of the fraction of mostly
cooperators, $\rho_C$, and mostly defectors, $\rho_D$, for a BA network
of contacts. The same plots for the rest of the topologies
explored (lattice and ER graphs) are indistinguishable
from those shown in the figure. We
therefore conclude that the average level of cooperation in
the system does not depend on the underlying structure.
This means that, under the assumption that the players
follow the behavior of the experiment in [14], there is no
network reciprocity, i.e., no matter what the network of
contacts looks like, the observed level of cooperation is
the same. This latter finding is in stark contrast to most
previous results from numerical simulations
of models in which many different updating rules, all
of them based upon the relative payoffs obtained by the
players, have been explored.

FIG. 1: (Color online) Density plot of the average level of
cooperation in the stationary state, $\langle c \rangle$, as a function of the
fractions of the three strategies (mostly cooperators, C, mostly
defectors, D, and moody conditional cooperators, X). The
plot corresponds to a Barabási-Albert network of contacts
($\langle k \rangle = 6$), but the corresponding plot for an Erdős-Rényi
graph or a regular lattice is indistinguishable from this one.
The system is made up of $N = 10^4$ players and the rest of the
parameters, taken from [14], are: $d = 0.38$, $a = 0.15$, $\gamma = 0.62$,
$\beta = -0.1$. The thin lines represent the mean-field
estimations [cf. Eq. (5)] for $\langle c \rangle = 0.32, 0.44, 0.56, 0.68$. They very
accurately match the contour lines of the density plot
corresponding to those values of $\langle c \rangle$, thus proving that the same
outcome is obtained in a complete graph (mean-field).
Simulation results have been averaged over 200 realizations.

The previous numerical findings can be recovered using
a simple mean-field approach to the problem. Let
the fractions of the three types of players be $\rho_C$, $\rho_D$ and
$\rho_X$ for mostly cooperators, mostly defectors, and moody
conditional cooperators, respectively, with the obvious
constraint $\rho_X = 1 - \rho_D - \rho_C$. Denoting by $P_t(A)$ the
cooperation probability at time $t$ for strategy $A$ ($= C, D, X$)
of the repeated PD we have

$$
\langle c \rangle_t = \rho_C P(C) + \rho_D P(D) + \rho_X P_t(X), \qquad (2)
$$

where $P_t(C) = P(C)$ and $P_t(D) = P(D)$ are known constants
[in our case $P(C) = 0.8$, $P(D) = 0.2$]. The probability
of cooperation for conditional players in the next
time step can be obtained as

$$
P_{t+1}(X) = (d + \gamma \langle c \rangle_t) P_t(X) + (a + \beta \langle c \rangle_t)[1 - P_t(X)], \qquad (3)
$$



where the first term on the right-hand side is the
probability that a conditional cooperator keeps playing as
a cooperator, whereas the second term stands for the
situation in which a moody conditional cooperator switches
from defection to cooperation. Asymptotically,
$P_t(X) \to P(X)$ and $\langle c \rangle_t \to \langle c \rangle$.
From Eq. (3),

$$
P(X) = \frac{a + \beta \langle c \rangle}{1 + a - d + (\beta - \gamma) \langle c \rangle}, \qquad (4)
$$

thus (2) implies (with the replacement $\rho_X = 1 - \rho_C - \rho_D$)

$$
A \rho_C + B \rho_D = 1, \qquad (5)
$$

where

$$
A \equiv \frac{P(C) - P(X)}{\langle c \rangle - P(X)}, \qquad
B \equiv \frac{P(D) - P(X)}{\langle c \rangle - P(X)} \qquad (6)
$$

are functions of $\langle c \rangle$. From Eq. (5) it follows that the
curves of constant $\langle c \rangle$ are straight lines in the simplex.
Figure 1 clearly demonstrates this fact: the straight lines
are plots of Eq. (5) for different values of $\langle c \rangle$. It can be
seen that they are parallel to the color stripes, and that
the values of $\langle c \rangle$ they correspond to accurately fit those
of the simulations. Figure 2 depicts the curve $\langle c \rangle$ vs. $\rho_C$
for two different values of $\rho_D$, as obtained from Eq. (5)
and compared to simulations. This figure illustrates the
excellent quantitative agreement between the mean-field
result and the simulation results. The match between
the analytical and numerical results is remarkable, as
is the fact that the result does not depend on the
underlying topology. This is the ultimate consequence of the
lack of network reciprocity: the cooperation level on any
network can be accurately modeled as if individuals were
playing in a well-mixed population.

FIG. 2: (Color online) Average cooperation level in the
stationary state, $\langle c \rangle$, as a function of the density $\rho_C$ of mostly
cooperators and two different values of the density $\rho_D$ of mostly
defectors, for two different kinds of networks: regular lattice
($k = 8$), and Barabási-Albert network ($\langle k \rangle = 8$). The
network size is $N = 10^4$ and the rest of the parameters are as in
Fig. 1. Lines represent the mean-field estimations. Results
are averages over 200 realizations. The inset is a zoom that
highlights how the different curves compare.
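The mean-field relations can be checked numerically. The sketch below is illustrative only (function names are hypothetical). It assumes the stationary value of $P(X)$ obtained by setting $P_{t+1}(X) = P_t(X)$ in Eq. (3), solves Eq. (2) self-consistently by bisection, and then verifies that the resulting point lies on the straight line of Eq. (5), with the coefficients obtained by substituting $\rho_X = 1 - \rho_C - \rho_D$ into Eq. (2). Parameter values are those of the caption of Fig. 1, and the fractions $\rho_C = \rho_D = 0.2$ are an arbitrary example.

```python
# Behavioural parameters and cooperation probabilities quoted in the text
# (caption of Fig. 1); all function names are illustrative assumptions.
d, a, gamma, beta = 0.38, 0.15, 0.62, -0.1
P_C, P_D = 0.8, 0.2


def p_x_stationary(c):
    """Fixed point of Eq. (3) for a given stationary cooperation level c."""
    return (a + beta * c) / (1 + a - d + (beta - gamma) * c)


def stationary_level(rho_c, rho_d, iterations=200):
    """Solve Eq. (2) self-consistently by bisection on [0, 1].

    Assumes the residual changes sign once on [0, 1], which holds for the
    parameter values used here when rho_c + rho_d > 0.
    """
    rho_x = 1 - rho_c - rho_d

    def residual(c):
        return rho_c * P_C + rho_d * P_D + rho_x * p_x_stationary(c) - c

    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def line_coefficients(c):
    """Coefficients of the straight line A*rho_C + B*rho_D = 1 at level c."""
    p_x = p_x_stationary(c)
    return (P_C - p_x) / (c - p_x), (P_D - p_x) / (c - p_x)


c = stationary_level(0.2, 0.2)
coef_a, coef_b = line_coefficients(c)
print(f"<c> = {c:.3f}, A*rho_C + B*rho_D = {coef_a * 0.2 + coef_b * 0.2:.6f}")
```

Because the coefficients depend only on the cooperation level, every pair of fractions with the same stationary level falls on the same straight line, which is why the contours in the simplex are straight.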

The steady state is reached after a rather short transient,
as illustrated by Figure 3. This figure compares the
approach of the cooperation level to its stationary state
as obtained by iterating Eq. (3) and from numerical
simulations on different networks with different sizes. The
initial cooperation level has been set to $\langle c \rangle_0 = 0.592$, close
to the value observed in the experiment of Ref. [14]. The
transient does exhibit a weak dependence on the underlying
topology and especially on the network size, but for
the largest simulated size ($N = 10^4$) the curves are all
very close to the mean-field prediction.

FIG. 3: (Color online) Time evolution of the cooperation level
until the stationary state is reached. The results have been
obtained from numerical simulations on different networks
with different sizes. The Mean-Field curve is the solution of
Eq. (3). $P(C) = 2/3$, $P(D) = 1/3$, $P(X; t = 0) = 1$, $\langle k \rangle = 8$,
$\rho_D = 0.586$, $\rho_C = 0.053$, $d = 0.345$, $a = 0.224$, $\gamma = 0.64$,
$\beta = -0.072$. Averages have been taken over $10^3$ realizations.
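The mean-field transient can be reproduced by direct iteration of Eqs. (2) and (3). The following sketch uses the parameter values listed in the caption of Fig. 3; the function names are hypothetical and the number of steps is an arbitrary choice. With $P(X; t = 0) = 1$, Eq. (2) gives the initial cooperation level quoted in the text.

```python
# Parameter values quoted in the caption of Fig. 3; names are illustrative.
P_C, P_D = 2 / 3, 1 / 3
RHO_C, RHO_D = 0.053, 0.586
d, a, gamma, beta = 0.345, 0.224, 0.64, -0.072


def cooperation_level(p_x):
    """Eq. (2): population average for a given P_t(X)."""
    return RHO_C * P_C + RHO_D * P_D + (1 - RHO_C - RHO_D) * p_x


def transient(steps, p_x=1.0):
    """Return [<c>_0, <c>_1, ..., <c>_steps] obtained by iterating Eq. (3)."""
    levels = [cooperation_level(p_x)]
    for _ in range(steps):
        c = levels[-1]
        p_x = (d + gamma * c) * p_x + (a + beta * c) * (1 - p_x)
        levels.append(cooperation_level(p_x))
    return levels


levels = transient(50)
print(f"<c>_0 = {levels[0]:.3f}, <c>_10 = {levels[10]:.3f}, <c>_50 = {levels[50]:.3f}")
```

In this iteration the distance to the stationary value shrinks by a roughly constant factor per round, consistent with the short transient described above.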

The only observable on which the topology does have
a strong effect is the payoff distribution among players.
Figure 4 shows these distributions for the three studied
topologies, and at two different times, short and long.
Smooth at short times, this distribution peaks around
certain values at long times. This reflects the fact that
payoffs depend on the number of neighbors of different
types around a given player, which yields a finite set of
values for the payoffs (the centers of the peaks). These
numbers occur with different probabilities (determining
the height of the peaks), according to the distribution

$$
Q(\mathbf{k}) = \sum_{k \ge 1} \binom{k}{k_C \; k_D}
\rho_C^{k_C} \rho_D^{k_D} \rho_X^{k_X} \, p(k), \qquad (7)
$$

where $p(k)$ is the degree distribution of the network and
$\mathbf{k} = (k_C, k_D, k_X)$, but it is understood that $k_X = k - k_C - k_D$.
The standard convention is assumed that the
multinomial coefficient $\binom{k}{k_C \; k_D} = 0$ whenever $k_C < 0$,
$k_D < 0$ or $k_X < 0$.
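The weights of the individual peaks can be tabulated directly from this distribution. The sketch below is an illustration with hypothetical function names. It reads Eq. (7) so that, for a given composition $(k_C, k_D, k_X)$, only the degree $k = k_C + k_D + k_X$ contributes, which is an interpretive assumption, and it uses a regular lattice with $k = 8$ and the fractions quoted in the caption of Fig. 4 as an example.

```python
from math import factorial


def multinomial(k, k_c, k_d):
    """Multinomial coefficient k! / (k_c! k_d! k_x!), zero for negative entries."""
    k_x = k - k_c - k_d
    if min(k_c, k_d, k_x) < 0:
        return 0
    return factorial(k) // (factorial(k_c) * factorial(k_d) * factorial(k_x))


def composition_probability(k_c, k_d, k_x, rho_c, rho_d, degree_dist):
    """Probability of a neighbourhood with k_c, k_d, k_x players of each type.

    degree_dist maps a degree k to p(k); only k = k_c + k_d + k_x contributes
    (the reading of Eq. (7) assumed in this sketch).
    """
    k = k_c + k_d + k_x
    rho_x = 1 - rho_c - rho_d
    weight = multinomial(k, k_c, k_d) * rho_c**k_c * rho_d**k_d * rho_x**k_x
    return weight * degree_dist.get(k, 0.0)


def all_compositions(degree_dist):
    """Yield every (k_c, k_d, k_x) compatible with the degrees in degree_dist."""
    for k in degree_dist:
        for k_c in range(k + 1):
            for k_d in range(k - k_c + 1):
                yield k_c, k_d, k - k_c - k_d


lattice = {8: 1.0}  # regular lattice with k = 8: p(k) is a single spike
q = {comp: composition_probability(*comp, 0.053, 0.586, lattice)
     for comp in all_compositions(lattice)}
most_likely = max(q, key=q.get)
print(len(q), most_likely, round(sum(q.values()), 12))
```

For a heterogeneous network the same functions apply with a dictionary holding several degrees, which spreads the weight over many more compositions and hence over more peaks.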

The approach to a stationary distribution of payoffs
exhibits a much longer transient. This is due to the
fluctuations in the payoffs arising from the specific actions
(cooperate or defect) taken by the players. These
fluctuations damp out as the accumulated payoffs approach
their asymptotic values. Thus, the peak widths shrink
proportionally to $t^{-1/2}$. In fact, one can show that the
probability density for the distribution of payoffs $\Pi_t$ for
strategy $Z$ can be approximated as

$$
W_Z(\Pi) = \sum_{k \ge 1} G\big(t\Pi - t\,\alpha_k(Z)\,\mu(\mathbf{k}),\;
\sqrt{t}\,\alpha_k(Z)\,\sigma(\mathbf{k})\big)\, Q(\mathbf{k}), \qquad (8)
$$

where $G(x, \sigma) = (2\pi\sigma^2)^{-1/2} e^{-x^2/2\sigma^2}$, the mean payoff
per neighbor received by a $Z$ strategist against a cooperator is

$$
\alpha_k(Z) = \frac{1}{k}\,\{P(Z) + T[1 - P(Z)]\},
$$

with $k = k_C + k_D + k_X$, and $\mu(\mathbf{k})$ and $\sigma(\mathbf{k})^2$ are the
average cooperation level in the neighborhood of the focal player
and its variance, respectively.

The approximate total payoff distribution,
$W(\Pi) = \rho_C W_C(\Pi) + \rho_D W_D(\Pi) + \rho_X W_X(\Pi)$, is compared in Fig. 4
with the results of the simulations for the longest time.

FIG. 4: (Color online) Distribution of the payoff per neighbor
in the stationary state for different network topologies:
regular lattice ($k = 8$), Erdős-Rényi ($\langle k \rangle = 8$) and
Barabási-Albert network ($\langle k \rangle = 8$). Black and blue lines represent the
results of numerical simulations for two values of time: $t = 10$
(black shallow curves) and $t = 10^4$ (blue, thick line curves),
while red lines represent the theoretical estimations for the
probability densities at $t = 10^4$, as obtained from Eq. (8).
$N = 10^4$, $\rho_D = 0.586$, $\rho_C = 0.053$, and other parameters
are as in Fig. 1. The simulation results are averages over $10^3$
realizations.

Summarizing, in this work we have shown both an- 
alytically and through numerical simulations that if we 
take into account the way in which humans are experi- 
mentally found to behave when facing social dilemmas on 
lattices, no evidence of network reciprocity is obtained. 
In particular, we have argued that if the players of a Prisoner's
Dilemma adopt an update rule that only depends
on what they see from their neighborhood, then cooper- 
ation drops to a low level — albeit nonzero — irrespective 
of the underlying network. Moreover, we have shown 
that the average level of cooperation obtained from sim- 
ulations is very well predicted by a mean-field model, 
and it is found to depend only on the fractions of dif- 
ferent strategists. Additionally, we have also shown that 
the underlying network of contacts does manifest itself in 
the distribution of payoffs obtained by the players, and 
has a slight influence on the transient behavior. 

To conclude, it is worth mentioning that our results 
only make sense when applied to evolutionary game models
aimed at mimicking human behavior in social dilemmas.
The independence of the topology seems to reflect
the fact that humans update their actions according to a
rule that ignores relative payoffs. Interestingly, absence
of network reciprocity has also been observed in numerical
simulations using best response dynamics [18], an update
rule widely used in economics that does not take into
account the neighbors' payoffs. This suggests that the
result that networks do not play any role in the repeated 
PD may be general for any dynamics that does not take 
neighbors' payoffs into account. We want to stress that 
the same kind of models thought of in a strict biologi- 
cal context are ruled by completely different mechanisms 
which do take into account payoff (fitness) differences. 
Therefore, in such contexts lattice reciprocity does play 
its role. In any case, our results call for further exper- 
iments that uncover what rules are actually governing 
the behavior of players engaged in this and other social 
dilemmas.

The average cooperation level in the neighborhood of the focal
player and its variance, which enter Eq. (8), are

$$
\mu(\mathbf{k}) = k_C P(C) + k_D P(D) + k_X P(X),
$$

$$
\sigma(\mathbf{k})^2 \equiv k_C P(C)[1 - P(C)] + k_D P(D)[1 - P(D)] + k_X P(X)[1 - P(X)].
$$